
Add Global MMLU Lite #2567

Open
wants to merge 9 commits into main

Conversation

shivalika-singh

Hi @baberabb

Reopening the PR for integrating Global MMLU with the eval harness. I followed the instructions here and made sure the pre-commit checks pass. Hopefully the tests will pass this time.

This PR integrates the "lite" version of Global MMLU, which contains 200 CS (culturally sensitive) and 200 CA (culturally agnostic) samples across 15 languages, with human translations. We recommend this dataset for evaluating multilingual models and would like to integrate it with the eval harness.

This is the initial version of the PR based on our discussion here. Let me know if any changes are needed before we can merge this.

cc: @marziehf

@CLAassistant

CLAassistant commented Dec 13, 2024

CLA assistant check
All committers have signed the CLA.

@baberabb
Contributor

@shivalika-singh Thanks for the PR; it mostly looks good. Just a couple of nits:

  1. Please sign the CLA if you agree. We can't merge without it.
  2. Add a README.
  3. Add an entry to tasks/README.md, with a sentence explaining the benchmark, like the other tasks.

I think we can also add a group config. Groups are similar to tags in that they both include multiple tasks, but the former also provides an aggregated metric.
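As a sketch, a group config along these lines could tie the per-language tasks together. The task names, file name, and aggregation keys below are illustrative assumptions, not the verified harness schema; check the existing group YAMLs in the repo for the exact format:

```yaml
# _global_mmlu.yaml -- illustrative group config (names are assumptions)
group: global_mmlu
task:
  - global_mmlu_ar   # hypothetical per-language task names
  - global_mmlu_bn
  - global_mmlu_hi
aggregate_metric_list:
  - metric: acc
    weight_by_size: true
```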

@shivalika-singh
Author

shivalika-singh commented Dec 13, 2024

Hi @baberabb , sure, I'll update the README, look into adding the group config, and update the PR shortly.

Regarding the CLA: I've been trying to sign it for a while. I agreed to it, but somehow that isn't reflected here. When I click the CLA link now, it shows "you have agreed..." and I no longer see a button to accept anything (as shown in the screenshot).
It's still not reflected here, even after clicking the "recheck" option quite a few times. Not sure what the reason is. Let me know if you have any suggestions.

[Screenshot: CLA page, 2024-12-13 9:46 PM]

@baberabb
Contributor

@shivalika-singh Hey, the CLA issue is because you pushed from an account different from the one you opened the PR with. See: cla-assistant/cla-assistant#661 (comment)

@shivalika-singh
Author

Hi @baberabb , I have updated the READMEs and signed the CLA.

I can look into adding the group config in a follow-up PR later this week, but it would be great if we could merge this for now if it looks good. Thanks!

@shivalika-singh
Author

shivalika-singh commented Dec 17, 2024

Regarding implementing group config, I'm thinking for this dataset it probably makes sense to have these tasks under the "global_mmlu" group:

  • culturally sensitive (CS)
  • culturally agnostic (CA)

But my understanding is that to support this, I'll have to update the dataset on Hugging Face as well. Right now on HF I have one subset per language (ar, hi, bn, etc.), but to support the group config I'd need separate CS and CA subsets uploaded for each language (i.e. ar_cs, ar_ca, etc.).

Please let me know if my understanding is correct, or if you'd suggest doing it a different way. I can certainly add these changes in a follow-up PR if that sounds good to you.

@baberabb
Contributor

baberabb commented Dec 17, 2024

Thanks for the updates!

You should be able to use process_docs to filter the rows. In your group config, e.g. for CS, add:

process_docs: !function utils.process_docs # <file>.<function_name>

and in utils.py (in the same folder) you can have:

import datasets

def process_docs(df: datasets.Dataset) -> datasets.Dataset:
    # filter according to the subset; can also use df.map()
    return df.filter(lambda row: row["cultural_sensitivity_label"] == "CS")

This will apply the filter to all the task datasets when you run the benchmark (e.g. --tasks global_mmlu_cs). Alternatively, you can add it to the individual task configs, but then you will need a separate config for each (e.g. ar_cs and ar_ca).
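The filter pattern above can be sketched without the datasets dependency. In this stand-alone sketch, a plain list of dicts stands in for a datasets.Dataset, and the row contents are hypothetical; only the cultural_sensitivity_label field and its "CS"/"CA" values come from the dataset described in this PR:

```python
# Stand-in rows: each dict mimics one dataset row with the label field.
rows = [
    {"question": "Which festival marks the harvest?", "cultural_sensitivity_label": "CS"},
    {"question": "What is the derivative of x**2?", "cultural_sensitivity_label": "CA"},
    {"question": "Which dish is served at weddings?", "cultural_sensitivity_label": "CS"},
]

def process_docs(docs, label="CS"):
    """Keep only rows whose label matches, mirroring df.filter(...)."""
    return [row for row in docs if row["cultural_sensitivity_label"] == label]

cs_rows = process_docs(rows, "CS")
ca_rows = process_docs(rows, "CA")
print(len(cs_rows), len(ca_rows))  # 2 1
```

With the real library, the same predicate would be passed to datasets.Dataset.filter, which applies it row by row and returns a new Dataset.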

@shivalika-singh
Author

Hi @baberabb , I've updated the README again. The failing test from the previous commit should pass now.
Hope we can merge this PR now.

Thanks for explaining process_docs. I'll test it and add it as an update in a follow-up PR shortly.

4 participants